#Apple Silicon

49 articles

Tech · May 14, 2026 · 29 min

oMLX 0.3.9.dev2 tested on M1 Max 64GB: SSD cache wins, VLM MTP slower

Tested oMLX 0.3.9.dev2 on M1 Max 64GB across 11 scenarios: SSD KV cache cuts Copilot prefill 88s→33s, VLM MTP slows decode 12-30%, omlx launch reaches Copilot/Codex/Claude Code.

AI, LLM, Local LLM, Apple Silicon, MLX, Inference Optimization, Codex, Experiment

Tech · May 13, 2026 · 12 min

Quake III's Q_rsqrt on Apple M4 vs Zen 3: when it actually beats 1/sqrtf in 2026

Tested Q_rsqrt on Apple M4 (Mac mini) and Zen 3 (Ryzen 5800HS / WSL2). On M4, -O2 already rewrites 1/sqrtf to frsqrte and ties Q_rsqrt; on x86, clang needs -ffast-math or hits a 12x gap. Hand-written NEON/SSE wrappers turn out slower. Also covers the error after 0, 1, and 2 Newton iterations and the Lomont constant.

C, Algorithm, Benchmark, Apple Silicon, Math, Game Development, Experiment

Tech · May 13, 2026 · updated · 8 min

oMLX 0.3.9.dev2 for Mac coding agents: Gemma 4 VLM MTP, DFlash, launch copilot

oMLX 0.3.9.dev2 release notes from the angle of Codex/Copilot on Mac local LLMs: what Gemma 4 VLM MTP, DFlash, omlx launch copilot, and the SSD KV cache each change for agent workflows.

AI, LLM, Local LLM, Apple Silicon, MLX, Inference Optimization, Codex

Tech · May 8, 2026 · 11 min

FLUX.2 Klein 9B + NSFW LoRA on M1 Max 64GB via mflux: 1m51s/512, 5m37s/1024 q4

Tested Klein 9B + 9B NSFW LoRA on M1 Max 64GB via mflux 0.17.5: 1m51s/512, 5m37s/1024 q4, 224/224 LoRA keys match, NSFW prompts uncensored, Japanese subjects work with helper tokens.

AI, Image Generation, FLUX, Apple Silicon, Mac, MLX, LoRA, Experiment

Tech · May 4, 2026 · updated · 13 min

FLUX.2 Klein NSFW LoRA on M1 Max: why a 9B LoRA won't load on 4B mflux (variant compatibility map)

Klein 4B / 9B / Base LoRAs aren't cross-compatible: a 9B NSFW LoRA throws 'lora key not loaded' on mflux's 4B path. Covers the variant map, what mflux runs today, and where the working hands-on test lives.

AI, Image Generation, FLUX, Apple Silicon, Mac, MLX, LoRA, Experiment

Tech · May 3, 2026 · 10 min

A FastAPI wrapper that takes Japanese, runs it through Ollama, and routes to ComfyUI or mflux to drive Anima, WAI-IL, and FLUX.2 Klein from one WebUI

Three local image generation engines (WAI-Anima, WAI-IL/SDXL, FLUX.2 Klein 4B) tied together by a thin FastAPI wrapper that takes Japanese prompts. Ollama (gemma3:12b) handles JP→EN, ComfyUI workflows are built on the fly in Python, FLUX.2 runs as an mflux subprocess, and the whole thing is reachable from an iPhone over Tailscale.

AI, Image Generation, ComfyUI, FLUX, Apple Silicon, Mac, Ollama, FastAPI, Tailscale, Experiment

Tech · May 2, 2026 · 23 min

Wiring Up a Multimodal Japanese Local RAG with FastAPI, Chroma, Open WebUI, and Ollama on M1 Max

Hands-on log of building the DEV article's PDF RAG on M1 Max 64GB, extending it with images via CLIP, and pushing through Japanese with bge-m3 + Qwen3.6 35B. Documents the modality gap, the dual inference server crash, and LLM-jp 4-8B's empty chat template silently dropping the system role.

AI, LLM, RAG, Local LLM, FastAPI, llama.cpp, Chroma, Python, Apple Silicon, Ollama, Japanese LLM, Experiment

Tech · May 2, 2026 · 20 min

Running Qwen-Scope's SAE on M1 Max 64GB to Extract a Japanese-Language Feature

A hands-on log of running Qwen-Scope's Sparse Autoencoder locally on M1 Max 64GB with Qwen3-8B-Base, extracting feature IDs that discriminate between Japanese, English, code, and Chinese from a single middle layer.

AI, LLM, Qwen, Interpretability, Experiment, Apple Silicon, MPS

Tech · Apr 30, 2026 · updated · 12 min

FLUX.2 Klein 4B benchmarked on M1 Max with mflux vs iris.c

Hands-on benchmark of FLUX.2 Klein 4B on M1 Max 64GB using mflux (MLX) and iris.c (pure C + Metal). A counter to Pruna AI's H100-only tutorial, measuring how fast Apple Silicon actually gets there.

AI, Image Generation, FLUX, Apple Silicon, Mac, MLX, Experiment

Tech · Apr 30, 2026 · 9 min

Can Xiaomi MiMo-V2.5 actually run on a Mac or ROCm?

After Xiaomi MiMo-V2.5's weights went public, I checked whether it runs on Mac/ROCm or on cloud GPU (RunPod/GCE). It's still rough on local hardware, but RunPod's 4x H200 runs it for ~$14/hr and GCE Spot H100 brings it down to ~$1.6/hr.

AI, LLM, Local LLM, Xiaomi, MoE, Apple Silicon, ROCm, llama.cpp

Tech · Apr 29, 2026 · updated · 15 min

Z-Anime turned out to be an anime-focused full fine-tune of Z-Image

Confirmed SeeSee21/Z-Anime is a full fine-tune of Z-Image Base, then ran the AIO version on local ComfyUI on an M1 Max 64GB to verify t2i, i2i, and how NSFW prompts pass through.

AI, Image Generation, Z-Image, ComfyUI, Apple Silicon, Experiment

Tech · Apr 29, 2026 · 17 min

Converting AI Illustrations to Manga BW with Screentone Instead of Grayscale

A verification log for converting color anime-style AI illustrations to manga-style monochrome. AI re-generation approaches tend toward either color leakage or face drift, while purely deterministic local processing looks mechanical. Outlines the next directions to try: putting a grayscale-only LoRA on Anima, and using See-through for part decomposition before mechanical composition.

AI, Image Generation, ComfyUI, LoRA, Manga, Screentone, Experiment, Apple Silicon, SDXL, Qwen, Z-Image, Anima